A Graph Partitioning Approach to Entity Disambiguation Using Uncertain Information
Authors
Abstract
This paper presents a method for entity disambiguation in information extraction from different sources on the web. Once entities and the relations between them have been extracted, it must be determined which of them refer to the same real-world entity. We model the task as a graph partitioning problem in order to combine the available information more accurately than a pairwise classifier can. Moreover, our method handles uncertain information, which turns out to be quite helpful. Two algorithms are trained and compared, one probabilistic and the other deterministic. Both are tuned using genetic algorithms to find the best weights for the set of constraints. Experiments show that graph-based modeling yields better results when uncertain information is used.
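As a rough illustration of the idea in the abstract, the sketch below treats extracted mentions as graph nodes, combines several weighted constraints into signed edge weights, and partitions the graph with a simple deterministic greedy merge. It is not the paper's algorithm: the constraint functions `same_surname` and `same_org`, the example mentions, and the hand-picked weights are all hypothetical (the paper tunes its weights with a genetic algorithm, which is not shown). Uncertain information is modeled here, as an assumption, by letting a constraint return 0.0 when a field is unknown.

```python
from itertools import combinations

# Hypothetical constraints: each returns a score in [-1, 1].
def same_surname(a, b):
    return 1.0 if a[0].split()[-1] == b[0].split()[-1] else -1.0

def same_org(a, b):
    # Uncertain case: an unknown organisation gives no evidence either way.
    if a[1] is None or b[1] is None:
        return 0.0
    return 1.0 if a[1] == b[1] else -1.0

def edge_weight(a, b, constraints, weights):
    # Weighted sum of all constraint scores for one pair of mentions.
    return sum(w * c(a, b) for c, w in zip(constraints, weights))

def greedy_partition(mentions, constraints, weights):
    # Start with singleton clusters; repeatedly merge the pair of clusters
    # with the largest positive total edge weight between them.
    clusters = [[m] for m in mentions]
    while True:
        best, best_pair = 0.0, None
        for i, j in combinations(range(len(clusters)), 2):
            score = sum(edge_weight(a, b, constraints, weights)
                        for a in clusters[i] for b in clusters[j])
            if score > best:
                best, best_pair = score, (i, j)
        if best_pair is None:
            return clusters
        i, j = best_pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

# Hypothetical mentions: (name, organisation or None if unknown).
mentions = [("J. Smith", "MIT"), ("John Smith", "MIT"),
            ("J. Smith", "Oxford"), ("A. Jones", None)]
clusters = greedy_partition(mentions, [same_surname, same_org], [1.0, 2.0])
print(clusters)
```

Scoring whole clusters rather than isolated pairs is what lets evidence from several mentions reinforce or veto a merge, which is the advantage over a purely pairwise classifier that the abstract points to.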
Similar resources
Collective Named Entity Disambiguation using Graph Ranking and Clique Partitioning Approaches
Disambiguating named entities (NE) in running text to their correct interpretations in a specific knowledge base (KB) is an important problem in NLP. This paper presents two collective disambiguation approaches using a graph representation where possible KB candidates for NE textual mentions are represented as nodes and the coherence relations between different NE candidates are represented by ...
Query Translation Disambiguation as Graph Partitioning
Resolving ambiguity in the process of query translation is crucial to cross-language information retrieval when only a bilingual dictionary is available. In this paper we propose a novel approach for query translation disambiguation, named “spectral query translation model”. The proposed approach views the problem of query translation disambiguation as a graph partitioning problem. For a given ...
Word Sense Disambiguation: A Graph-Based Approach Using N-Cliques Partitioning Technique
This paper presents a new approach to solve semantic ambiguity using an adaptation of the Cliques Partitioning Technique to N distance. This new approach is able to identify sets of strongly related senses using a multidimensional graph based on different resources: WordNet Domains, SUMO and WordNet Affects. As a result, each Clique will contain relevant information used to extract the correct ...
An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches
Text coherence evaluation has become a vital and appealing task in Natural Language Processing subfields such as text summarization, question answering, text generation and machine translation. Existing methods, such as entity-based and graph-based models, deal with how nouns and noun phrases change role across sequential sentences within a short part of a text. They even have limitations in global coheren...
A multi agent method for cell formation with uncertain situation, based on information theory
This paper treats the cell formation problem as a distributed decision network. It proposes an approach based on the application and extension of information theory concepts in order to analyze informational complexity in an agent-based system, due to interdependence between agents. Based on this approach, new quantitative concepts and definitions are proposed in order to measure the amount of t...